DMS: A Parallel Data Mining Server

Author

  • Felicity A. W. George
Abstract

Tandem’s Data Mining Server (DMS) is a parallel data engine designed to let data mining tools store, access and analyse high volumes of data very efficiently. In contrast to traditional database management systems, its data structures are optimized for analysis and pattern recognition rather than for accessing individual rows. The data stored in DMS tables are encoded automatically, so the required disk space is typically 3 to 5 times smaller than the raw data size. This approach not only minimizes disk space and disk access time, but also enables the majority of processing to be done in memory.
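
The abstract does not say which encoding DMS uses. As a rough illustration only, the sketch below uses dictionary encoding, one common way to shrink a column of repeated values and to run analysis directly on the compact codes. The function names (`dictionary_encode`, `count_by_code`) and the sample column are hypothetical and are not taken from DMS.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer code.

    Assumption: this stands in for whatever encoding DMS applies;
    the abstract does not describe the actual scheme.
    """
    codes = {}
    encoded = []
    for v in values:
        # A new value receives the next unused code; a seen value reuses its code.
        encoded.append(codes.setdefault(v, len(codes)))
    # Decoding table: position i holds the original value for code i.
    dictionary = [None] * len(codes)
    for value, code in codes.items():
        dictionary[code] = value
    return dictionary, encoded


def count_by_code(encoded, n_codes):
    """Count occurrences per code without decoding any row."""
    counts = [0] * n_codes
    for code in encoded:
        counts[code] += 1
    return counts


# Hypothetical sample column with many repeated values.
column = ["London", "Paris", "London", "Madrid", "Paris", "London"]
dictionary, encoded = dictionary_encode(column)
counts = count_by_code(encoded, len(dictionary))
histogram = dict(zip(dictionary, counts))
print(encoded)
print(histogram)
```

In this sketch the aggregate is computed over the integer codes alone, and the decoding table is consulted only once at the end to label the result. That mirrors the general idea in the abstract that a compact encoded form can make in-memory analysis practical.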

Related resources

A Parallel Data Mining Architecture for Massive Data Sets

This paper discusses a parallel data mining architecture which provides the capability to mine massive data sets highly efficiently, scanning millions of rows of data per second. In this architecture the mining process is divided into two distinct components. A parallel server, Compaq’s Data Mining Server (DMS), provides a set of data mining primitives which are utilized by a data mining client...

Data Mining: a Database Perspective

Data mining on large databases has been a major concern in the research community, due to the difficulty of analyzing huge volumes of data using only traditional OLAP tools. This sort of process implies a lot of computational power, memory and disk I/O, which can only be provided by parallel computers. We present a discussion of how database technology can be integrated with data mining techniques. Fin...

A Genetic Programming Framework for Two Data Mining Tasks: Classification and Generalized Rule Induction

This paper proposes a genetic programming (GP) framework for two major data mining tasks, namely classification and generalized rule induction. The framework emphasizes the integration between a GP algorithm and relational database systems. In particular, the fitness of individuals is computed by submitting SQL queries to a (parallel) database server. Some advantages of this integration from a ...

Data-Mining: A Tightly-Coupled Implementation on a Parallel Database Server

Due to the increasing difficulty of discovering patterns in real-world databases using only conventional OLAP tools, an automated process such as data mining is currently essential. As data mining over large data sets can take a prohibitive amount of time related to the computational complexity of the algorithms, parallel processing has often been used as a solution. However, when data does not...

Publication date: 1998